12 research outputs found

    Invariant Causal Prediction for Nonlinear Models

    Full text link
    An important problem in many domains is to predict how a system will respond to interventions. This task is inherently linked to estimating the system's underlying causal structure. To this end, Invariant Causal Prediction (ICP) (Peters et al., 2016) has been proposed which learns a causal model exploiting the invariance of causal relations using data from different environments. When considering linear models, the implementation of ICP is relatively straightforward. However, the nonlinear case is more challenging due to the difficulty of performing nonparametric tests for conditional independence. In this work, we present and evaluate an array of methods for nonlinear and nonparametric versions of ICP for learning the causal parents of given target variables. We find that an approach which first fits a nonlinear model with data pooled over all environments and then tests for differences between the residual distributions across environments is quite robust across a large variety of simulation settings. We call this procedure "invariant residual distribution test". In general, we observe that the performance of all approaches is critically dependent on the true (unknown) causal structure and it becomes challenging to achieve high power if the parental set includes more than two variables. As a real-world example, we consider fertility rate modelling which is central to world population projections. We explore predicting the effect of hypothetical interventions using the accepted models from nonlinear ICP. The results reaffirm the previously observed central causal role of child mortality rates

    Conditional variance penalties and domain shift robustness

    No full text
    When training a deep neural network for image classification, one can broadly distinguish between two types of latent features of images that will drive the classification. We can divide latent features into (i) ‘core’ or ‘conditionally invariant’ features C whose distribution C| Y, conditional on the class Y, does not change substantially across domains and (ii) ‘style’ features S whose distribution S| Y can change substantially across domains. Examples for style features include position, rotation, image quality or brightness but also more complex ones like hair color, image quality or posture for images of persons. Our goal is to minimize a loss that is robust under changes in the distribution of these style features. In contrast to previous work, we assume that the domain itself is not observed and hence a latent variable. We do assume that we can sometimes observe a typically discrete identifier or “ID variable”. In some applications we know, for example, that two images show the same person, and ID then refers to the identity of the person. The proposed method requires only a small fraction of images to have ID information. We group observations if they share the same class and identifier (Y, ID) = (y, id) and penalize the conditional variance of the prediction or the loss if we condition on (Y, ID). Using a causal framework, this conditional variance regularization (CoRe) is shown to protect asymptotically against shifts in the distribution of the style variables in a partially linear structural equation model. Empirically, we show that the CoRe penalty improves predictive accuracy substantially in settings where domain changes occur in terms of image quality, brightness and color while we also look at more complex changes such as changes in movement and posture

    Invariant Causal Prediction for Nonlinear Models

    No full text
    An important problem in many domains is to predict how a system will respond to interventions. This task is inherently linked to estimating the system’s underlying causal structure. To this end, Invariant Causal Prediction (ICP) [1] has been proposed which learns a causal model exploiting the invariance of causal relations using data from different environments. When considering linear models, the implementation of ICP is relatively straightforward. However, the nonlinear case is more challenging due to the difficulty of performing nonparametric tests for conditional independence
    corecore